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We present a matrix product state (MPS) algorithm to approximate ground states of translation- 
ally invariant systems with periodic boundary conditions. For a fixed value of the bond dimension 
D of the MPS, we discuss how to minimize the computational cost to obtain a seemingly opti- 
mal MPS approximation to the ground state. In a chain of N sites and correlation length f, the 
computational cost formally scales as g(D,^/N)D s , where g(D,£/N) is a nontrivial function. For 
£ <C N, this scaling reduces to D 3 , independent of the system size N, making our algorithm N times 
faster than previous proposals. We apply the method to obtain MPS approximations for the ground 
states of the critical quantum Ising and Heisenberg spin-1/2 models as well as for the noncritical 
Heisenberg spin-1 model. In the critical case, for any chain length N, we find a model-dependent 
bond dimension D(N) above which the polynomial decay of correlations is faithfully reproduced 
^^ throughout the entire system. 
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(N| I. INTRODUCTION 

Concepts of entanglement for many-body quantum systems have recently proven useful to devise new methods 
O ! for the numerical simulation of quantum spin chains. It has been shown that the very successful density matrix 
renormalization method (DMRG) pQ can be rephrased as a variational method over the class of matrix product states 
(MPS) 0-[5]; this realization clarified the relatively poor performance of DMRG for systems with periodic boundary 
conditions (PBC), as MPS with open boundary conditions (OBC) do not have the right entanglement structure. It 
was shown in [4 how this could be cured by using a MPS with PBC. However, due to the cyclic structure of the 
underlying MPS, the computational cost of the simulation in terms of the MPS bond dimension D grew from 0(D 3 ) 
to 0{D 5 ). This was subsequently lowered to 0{D 3 ) in 0[7]. 

An important motivation to study finite chains is that one can compute bulk properties of the system in the 
thermodynamic limit by extrapolating results obtained for increasingly large chains |S]. In this context, it is relevant 
whether OBC or PBC are considered. For a finite chain with OBC, local expectation values differ from those in 
thermodynamic limit due both to finite-size effects and to boundary effects, and larger chains need to be considered. 

iy\ In contrast, with PBC only finite-size effects are present. This makes the extrapolation to the thermodynamic more 

transparent and smaller systems need to be simulated. Another important advantage of PBC is that only in this 
case a finite chain can be translation invariant (TI) [55J . This is crucial feature for the present work, where TI [23J is 

J^ exploited in order to reduce the computational costs of simulating finite chains. 

y—i Pippan, White and Evertz |7] recently showed how to simulate spin chains with PBC with an MPS algorithm whose 

computational cost given in terms of D scales like 0(D 3 ). The intuition behind this scaling can be understood if 
one first considers systems with a correlation length £ that is much shorter than the system size TV. Let us choose 
a block of sites with size I such that £ > I (see figure Ilk). In this case correlations between the left and the right 
ends of the block are mediated only through the sites inside the block. It is clear that the properties of this block are 
exactly the same as those of a block of equal length embeded in the bulk of a sufficiently large system with OBC. It 
is then not surprising that computing observables that are contained within the block has a cost proportional to D 3 , 
as in the case of OBC. This is basically due to the fact that such calculations involve contracting a tensor network 
that has, as uncorrelated left and right boundary conditions, two boundary vectors with D 2 components [5]. Now 
imagine we are interested in the description of properties contained in a larger block such that £ > I > N — £ (see 
figure Tp). This block is small enough for its ends to have correlations that are mediated via its own sites, yet large 
enough that correlations are also mediated via the sites outside the block, since now N — I < £. If these externally 
mediated correlations are relatively small, the situation is not very different from the previously described case where 
I < N — £. All we have to do is to replace the two uncorrelated boundary vectors with a low rank boundary matrix 
that contains the small amount of correlations. If the rank of the matrix is n, then the cost of this algorithm will be 
proportional to nD 3 . 

We emphasize two important aspects of the computational cost of the algorithm in Ref. [7] . The first one is that 
the cost is also proportional to the system size N, due to the usual sweeping procedure that optimizes one site at each 
instant. We will show below how, in the case of a TI chain, one can get rid of this factor [53]. This is achieved by 
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(a) Medium f , small I: equivalent to OBC 
environment. 



(b) Medium £, medium I: equivalent to 
partially correlated OBC environment. 



(c) Large £, any I: equivalent to fully 
correlated OBC environment. 



FIG. 1: (Color online). The properties of a block of size I within a PBC system can be equivalent to those of a block with same 
size in the bulk of an OBC system. Depending on I and the correlation length £, the left and right boundary conditions of the 
OBC system are more or less correlated. 



using a TI MPS, where the N tensors of the MPS are chosen to be identical. For all D, the precision of our results is 
comparable to that reported in Ref. [7|. This indicates that restricting the MPS ansatz to be TI does not lead to a 
loss of precision, while yielding a substantial reduction of the computational cost. The second one is the multiplicative 
factor n corresponding to the rank of the boundary matrix that transfers correlations between the ends of a block. 
In the case where the correlation length £ is of the order of the system size N (see figure fit) , this factor may not be 
small. In a worst case scenario, where strong correlations between distant sites would force the boundary matrix to 
be full rank, i.e. n = D 2 , the approach in Ref. [7] would not be better than the 0(D 5 ) algorithm of Ref. [4 . Thus 
for critical systems where £ « N it is a priori unclear what the overall scaling of the computational cost in D will be. 
However, in Ref. [7] it has been indicated that if D is not too large, the ground state energy of a critical spin chain 
obtained using a small constant n is satisfactory, in that its accuracy scales with D in a similar way as it would in an 
OBC chain of the same size. 

Here we shall show how to exploit TI to obtain a faster algorithm that, for instance, does not scale with TV when 
£ <C N. However, except for the case ^ < JV, we still lack a precise characterization of how the cost scales as a 
function of D and N. We benchmark the present approach by addressing both critical (i.e. £ f=a TV) and non-critical 
(i.e. £ <C N) chains. An inrportant observation is that in the case of critical systems the finite bond dimension D 
of the MPS introduces an effective correlation length ^p ps D k [blTfTT] that depending on D can be much smaller 
than the actual one. This implies that as N grows, a larger bond dimension D w N 1 ^ needs to be considered 
if correlations between distant sites of the chain with PBC are to be properly captured Our numerical results are 
consistent with a complex scenario where the cost of simulations is dominated by the crossover between finite- TV and 
finite- D corrections, as further discussed in [IB] , 

The rest of the paper is structured as follows: we start by sketching the main idea of the approach in Sect. [Hj 



followed by an in-depth presentation of the algorithm in Sect. |HI| In Sect. IV we present numerical results for the 
critical Quantum Ising and Heisenberg spin-1/2 models as well as for the non-critical Heisenberg spin-1 model. Finally 
Sect. |V] contains some conclusions. 



II. OVERVIEW 

This work is concerned with the approximation of ground states (GS) within the variational class of MPS with PBC 
defined in 4|. Since critical systems are arguably anrong the most challenging ones from a computational perspective, 
we will apply the approach to investigate critical spin chains (although non-critical chains can also be considered). 
An important restriction is that we only consider TI systems, which we will analyse with a TI MPS ansatz, namely 
an MPS where the tensors corresponding to different sites are all equal. The resulting variational class is a subclass 
of the one defined in [4 . The TI MPS with PBC reads 
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with identical matrices A t at every site. Note that since for fixed i, each Ai represents a matrix, the MPS is completely 
characterized by the three dimensional tensor A™ a =: A. Furthermore we should point out that we will mostly be 
interested in Hamiltonians that are real and reflection invariant; these symmetries can be implemented at the level 



of MPS by choosing the matrices At real and symmetric. This extra constraint does not seem to deteriorate the 
accuracy of the variational procedure. 

Since our ansatz consists of N copies of the same tensor, the energy is not a quadratic expression in the variables 
defined by the tensors A;; this implies that we cannot use the sweeping procedure described in [1] or any other 
procedure that lowers the energy by minimizing it for one site at a time. While this might seem a reason to be 
concerned at first, it will actually be the key to reducing computational costs. 

The advantages of a TI MPS ansatz (with periodicity one or two) have already been exploited in the context 
of infinitely long chains [TJ I5HT3]. Refs. [H0E] used a TI MPS in the context of infinite system DMRG. In 
Ref. [TO] , instead, a (two-site periodic) MPS approximation to ground states was obtained by imaginary time evolution. 
Refs. P21 [13] discussed how to compute ground states with a one-site TI MPS when the imaginary time evolution 
operator can be well enough approximated by layers of one-site TI matrix product operators. An attempt to adapt 
that method to finite chains with PBC yielded results that are not as accurate as one might expect [25]. Finally, we 
also point out that a TI MPS with PBC was already used in Ref. [Sj together with Monte Carlo sampling techniques, 
with a formal cost 0{ND 3 ). In that case, the use of sampling techniques reduced the cost from 0(D 5 ) to 0(D 3 ), 
but at the same time enforced the multiplicative factor N, since a TI MPS does not represent a TI state once a given 
configuration is chosen during the sampling. 

An obvious way to find the TI-MPS with minimal energy is a multidimensional minimization procedure that requires 
only evaluations of the function itself, such as the downhill simplex method [191. When no further information about 
the function is available, this is indeed the method of choice. It is extremely robust but also extremely slow. However, 
if there is a feasible way to obtain more elaborate information such as the gradient or the Hessian, there are methods 
relying on these quantities that are clearly superior in what regards the speed of convergence and the required storage 
space. 

In the following we will present an efficient algorithm to calculate the gradient of the energy V-E(a) where the 
argument a = vec(A) denotes the vector containing all entries of the MPS tensor A. The result will then be used 
by a standard numerical library conjugate gradient algorithm to find a minimum of E(a). We must emphasize that 
this minimum is by no means guaranteed to be the global one i.e. the optimal ground state approximation within 
the subspace defined by our special MPS ansatz. However, our numerical results seem to be slightly more accurate 
than previous results [7], while we have obtained a reduction in computational costs. We will illustrate the accuracy 
of this approach by applying it to two exactly solvable models in order to give exact values for the numerical errors. 

The computational cost will turn out to scale as 0(mnD 3 ) + 0(n 2 D 3 ) where D is the virtual bond dimension and m 
and n are some parameters to be specified below. Briefly speaking, the scaling can be understood as follows: first we 
approximate large powers of the MPS transfer matrix, whose exact definition will be given later in the text, within a 
reduced subspace of dimension n. Treating each of the n dimensions separately allows us to transform the contraction 
of a tensor network with PBC (which scales as 0(D 5 )) into n contractions of tensor networks with OBC (each of 
which scales as 0(D 3 )). As we will explain in more detail in the next section, the resulting tensor networks will still 
contain at most one portion represented by say m adjacent transfer matrices that is not connected to the already 
approximated one. If m is large, this second portion can again be approximated within a n-dimensional subspace 
thereby yielding the scaling 0(n 2 D 3 ). If m is small, we are forced to contract the transfer matrices one after the 
other which gives the scaling 0(mnD 3 ). 

III. THE ALGORITHM 

Let us rearrange the MPS tensor components in a vector a = vec(A) which allows us to write the energy as a 
function over the manifold of free parameters in the MPS 

(y>(a)|ffh/>(a)) = (y,(A)|g|V>(A)) 
1 ' (V(a)|V(a)) " WA)|V(A)) ' l ) 

Note that due to the constraints that the matrices are real and symmetric, the number of vector components in a has 
been reduced to ^dD(D + 1). As we will treat only spin-1/2 chains (i.e. d = 2) in this work, the variational parameter 
manifold is actually D(D + l)-dimensional. Furthermore we will denote expectation values taken with respect to the 
MPS defined by the tensor A as (0) A := (ip(A)\ O \ip(A)). 

Also note that (pi) can have local extrema as opposed to E(^>) = (\l/| if |\l/) which is a convex quantity in the 
exponentially large Hilbert space. The MPS-parametrization restricts the full parameter space to a submanifold thus 
possibly generating local extrema where all derivatives in this subspace vanish. If one uses as a starting point of the 
conjugate gradient algorithm a random vector a ran ^, the search algorithm will typically get stuck in a local minimum. 
In order to avoid getting stuck in one of these, we will choose as a starting point a vector ap of which we can be sure 



that it is close to the global minimum. This approach turns out to be very robust and fast. If we are interested in 
ground states of chains with very large N, the most natural choice for the starting vector is an MPS approximation 
of the GS of the same model in the thermodynamic limit. Note that this MPS must have exactly the same symmetry 
properties as our ansatz. It was shown in previous work 13J how to obtain this MPS and we will actually use the 
tensors computed there as starting points for the present algorithm. It is obvious why the MPS for the GS of the 
infinite chain is a good choice if one is interested in finite PBC-chains with N ^> £,(D), where £(D) is the correlation 
length induced by finite D. However, it turns out that this approach also works satisfactory for moderately large N. 
Of course, if there already is any PBC solution available, using that one as a starting point may provide a gain in 
convergence time, especially if the chain lengths are similar. 
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FIG. 2: (Color online), (a) Graphical representation of the TI PBC MPS |V>(A)) of a TI spin chain with 4 sites. Note 
the identical tensors A at each site, (b) Small perturbation SA is added to the to the MPS tensor A. (c) Norm of a state 
{ip(A)\ip(A)). (d) Expectation value of a 2-site operator e.g. (ip(A)\H a , a +i\ip(A)) . (e) The expectation value is expanded in 
powers of <5A. 



The gradient V£(a) reads explicitly 
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It turns out that this quantity can be computed efficiently First, since we assume a translationally invariant 
Hamiltonian with nearest neighbour interactions, we have 
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Hence the first term in Eih is proportional to the gradient of the energy density pe(&) — (H s ,s+i)a, Vs 6 [1,N] 
(see figure Kp). Second, we can obtain gradients such as the ones occurring in (pi) numerically at a given point a 
by expanding the differentiated quantity in powers of 5a and computing the coefficient of the linear term. Thus the 
derivative in the first term is obtained via 



PE (a + 5a) = p E (a) + <Ja[V a ^B(a')] a , =a + 0(5a 2 



(4) 



and the one in the second via 



(^(a + <Ja)|V(a + £a)) = (^(a)|V(a)> +<5a[V a < (^(a')|V(a'))] a , =a + 0(6a 2 ) 



(5) 



Let us first consider Q. This can can be computed explicitly by taking a sum of completely contracted tensor 
networks (see figure pfe). Let H e ff(A) denote the object that is obtained by removing the tensor 5 A from each term 
of pe(sl+ Sa) that is linear in 5a (see figure pBl. This is a tensor with three indices, that reshaped in vector form, 
yields the desired derivative Vps(a) = vec(JT e //(A)). The computational cost for the exact contraction of the tensor 
networks in H e ff(A) scales as 0(ND 5 ) [4]. We will give below a prescription of how this can be improved [26] to 
0(mnD 3 ) + 0(n 2 D 3 ) for chains of arbitrary lengths. Furthermore we will show how to choose the smallest possible 
parameters m and n such that no loss in precision occurs and why the scaling reduces to 0(mD 3 ) + 0(nD 3 ) in the 
case of very long chains. 
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FIG. 3: (Color online). Graphical representation of the tensor H e ff(A) and N e ff(A). 

The other piece that is necessary for the computation of V£(a), is the derivative occurring in the second term of 
(pi); this term can be obtained in a very similar way (see figure [3]). We will use the notation iV e //(A) for the object 
defined by V (i/j(a)\ip(a)) =: vec(7V e //(A)). Due to the simpler structure of the tensor network the computational 
cost here will scale as 0(nD 3 ) for arbitrary chains and as 0(D 3 ) for very long chains. 

Now let us introduce the following convention for denoting incomplete tensor networks where merely one of the 
MPS tensors is missing: (0)J£ shall henceforth denote the expectation value of the operator O with respect to the 
TI MPS defined by the tensor A, where one tensor A has removed been removed from |-0(A)) at site s. Following 

this definition, the first term in the graphical representation of H e ff(A) (see figure^]) reads (#2,3)^ • ^ a tensor has 

been removed from (-0(A)| at site s, we will denote this by underlining the site index, thus we write (O)]^. Using 
this convention we can write H e ff{A) as 
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For real Hamiltonians and real MPS this reduces of course to 
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Similar considerations hold for N e tf(A). Thus, using / to denote the identity operator, we can rewrite the gradient 
of the energy (|3| as 
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In the last part of this section we will briefly sketch how a gradient based procedure can be employed to find ground 
states of PBC chains if one is dealing with complex Hamiltoiiians and thereby complex MPS. One possibility is to use 
a gradient based algorithm that converges to a minimum of the real-valued function E : C n — > R within the complex 
manifold (n stands here for the number of independent complex parameters in the MPS) . It can be shown that in this 
case one obtains the same expression ([8]) for the gradient of the energy albeit the individual terms are now complex 
valued vectors. However, since standard library routines for gradient based search cannot minimize over complex 
manifolds, let us mention the second possibility just for the sake of completeness. Due to a = x + iy with x, y € R™, 
one can treat the energy as an analytic function over a real manifold with twice as many degrees of freedom, i.e. 
E : R 2n —} R. Similar considerations to the ones leading to (18]) yield then for the gradient 
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A. Computation of H e ff(A.) 



We introduce now a shorthand notation for the building blocks of i? e // (A) that will allow us to express it in a very 
compact way. From the graphical representation (see figure H) it should be obvious what the objects H 
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n AA-> 



T A and Ta mean; note that T denotes the MPS transfer matrix that has been repeatedly mentioned in 



the previous sections. For the sake of completeness we also give the definition of the tensor H^a explicitly in terms 
of its components: 
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FIG. 4: (Color online). Graphical representation of H AA , H A 
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Here we have used greek letters to label the virtual bonds, latin ones for the physical bonds and Einstein summation 
convention to denote contracted indices. If one combines the left-hand side indices a and a' into one big index and 
does the same for the right-hand side indices 7 and 7', it is clear that Haa represents a D 2 x D 2 -matrix. The other 
objects defined in figure R] have similar explicit definitions. H e ff(A.) now reads 
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where Tr* [. . . ] indicates that the trace is taken only with respect to the matrix multiplication of the "outer" indices of 
the "big" D 2 x £) 2 -matrices. These "big" matrices may have internal open indices that survive the Tr* [. . . ]-operation 
and make sure that H e f f (A ) is left with its tensor structure s.t. it can be later reexpressed as a vector. 

The computation of ( |11| is the bottleneck of our method. If we would compute it by straightforward matrix 
multiplication, even using the sparseness, the computational cost would scale as 0(ND 5 ). In order to improve this 
scaling, the crucial point is to realize that for large N most terms in ( 11 ) will contain high powers of T which means 



that they can be very well approximated within the subspace spanned by the dominant eigenvectors |27j of T. This 
can be easily seen if we write such factors in their eigenbasis 
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where |Ai| > | A2 1 ■ • • > | A^2 1. Obviously the subspace corresponding to the small magnitude eigenvalues is suppressed 
exponentially with s and thus can be neglected for powers s that are large enough (e.g. for s = 20 and \j^\ sw 0.1, 

\^*-\ s ~ 10~ 20 < 10~ 16 which is the machine precision of double precision floating point numbers). In these cases it 
is perfectly fine to restrict ourselves to the subspace spanned by say n dominant eigenvectors, with the parameter n 
yet to be determined. In fact, we will perform the entire computation a few times, starting with a rather small n 
and increasing it until the result does not improve any more. When this happens, we know that we have found the 
optimal n beyond which, when all other parameters are fixed, the precision does not get any better. Thus we will 
approximate large powers of the transfer matrix as 
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At this point we must remark that this approximation only works if the moduli of the transfer matrix eigenvalues |A a | 
are not concentrated around a certain point (i.e. T is not approximately proportional to unity). In that case, any 
increment of n will improve the precision and we will end up with very bad overall scaling |29j . For models where this 
behaviour occurs the algorithm presented here may be worse than contracting the tensor networks explicitly, where 
the scaling is 0(ND 5 ). In these cases the chain length N ultimately decides which method is preferable. Fortunately 
for the models treated by us, this undesirable behaviour does not occur and we end up with relatively small n beyond 
which the precision does not improve any more. 

Let us now return to (11 1. There are two different types of terms which must be treated differently The first and 
the second term under our somewhat unorthodoxly defined trace can be considered as " easy" . They are approximated 
by 
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which is computed within 0(nD 3 ) operations. This is because each contraction (Aq,| H A a |A a ) can be performed with 
cost 0(D 3 ) and this has to be done n times. 

The computationally more expensive terms are the ones under the sum over s, where two different powers of T are 
involved. We will call these terms "hard". They are approximated by 
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Here we must remark two things: i) it is not necessary to let the second index /3 run over the same range as a. It would 
be possible to choose as an upper bound a further parameter n! and also vary this one until the precision does not 
improve any more. However, since expression ( 15 ) has obviously left-right symmetry, it is sensible to assume that the 
optimal result would yield n = n' . Even if this would not be the case, due to the fact that we scan along n, convergence 
will be reached only for some n ptimai > sup{n, n'}, so we will find the lowest achievable energy anyway; ii) for very 
small or very large s either the left or the right transfer matrix segments in ( 15 ) can not be well approximated by a 



little number of eigenvalues n since the lower X a are not sufficiently suppressed by the small exponent. In the worst 
case we would have to take all D 2 eigenvalues into account, which dramatically increases the computational cost. In 
order to solve this issue we will compute these terms by exact contraction of segments of length m, which introduces 
this further parameter into our algorithm. This will be explained in more detail further below. For the moment let 
us note that depending on the magnitude of s, we can further separate the sum in (11) over the "hard" terms into 
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We call the terms over which the second sum is taken "medium-s" terms and will treat them differently from the 
"extremal-s" terms that appear in the first respectively third sum. Thus H e ff(A.) can be divided into 
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1. Computation of "extremal-s" terms 

In this section we treat the terms with small respectively large s. The first thing to remark is that for large N, if 
T s can not be well approximated within some low-dimensional subspace because s is too small, it is very likely that 
for T N -3-s ^ e approximation will work due to N — 3 — s 3> s. The same observation holds in the other direction 
if s is too large. Secondly, depending on the MPS bond dimension D and the ammount of entanglement present in 
the MPS (i.e. depending on the model one is treating), there is a certain m above which T B with s > m can be 
faithfully approximated within the n < D 2 -dimensional subspace spanned by n dominant eigenvectors. As we don't 
know anything about m a priori, we introduce it as a further parameter into our algorithm. We will scan m within 
its range [1,1/2(N — 2)] and in the end we will obtain some optimal pair (m,n). The reason why m does not go 
all the way up to N — 3 is that in order for our algorithm to scale effectively as D 3 , we must employ the dominant 
eigenvector approximation on the other half of the chain. Without it we would get the undesirable scaling 0(ND 5 ). 
The contraction (see figure foy we must perform for each term with small s thus reads 
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= Tr* 



HiiT s T A T N - 3 - s w 53 <Aa| HHT S T A \\ a ) \^ 3 - s , Vs < m 



(18) 



and can be done with computational cost 0(nD 3 ) using a sparse matrix contraction scheme. As we have to repeat 
this procedure m times, the total cost scales as 0(mnD 3 ). 
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FIG. 5: (Color online). Graphical representation of a term with small s and its approximation within the subspace spanned 
by n dominant eigenvectors of T. 

The large s terms (i.e. when N — 3 — m < s < N — 3) can be easily obtained by making use of the left-right 
symmetry of the tensor network around the point with s = (N — 2)/2. The sum over all these s turns out to be 
related to the sum over the small s terms by taking the transpose with respect to the open virtual bond indices at 
the empty site where T A sits. Thus the computational cost remains unchanged 0(mnD 3 ). 



2. Computation of "medium-s" terms 



For terms where s is neither too small nor too large, both powers of the transfer matrix (i.e. T s and T N ~ 3 ~ S ) can 
be well approximated whithin the subspace spanned by n dominant eigenvectors. The good news is that in this case 
the sum over s can be performed analitically in contrast to the "extremal-s" case where we had to compute each 
of the m terms separately. However, there is also bad news, namely that we now have an additional sum over the 
eigenvalue index stemming from the approximation of T w_3_s . Explicitly the sum over all "medium-s" reads 



N-S-m 

nN-3-s 



s—rn 
N-3~2m n 

wTr *[ E E HiiT m \\ a )\° a (\ a \T A \\e)\%- 3 - 2m - s (\p\T m ] (19) 



s=0 ct,/3=l 



^JV-2-2m _ \ JV-2-2n 



= E i^P\ H AA \^a) (K\T A \\p) A™ A™ 

a 1 A^ - A Q 

a,/3=l r 

In the first step we have shifted the summation variable s and have written the matrices T in their eigenbasis. To 
arrive from the second to the third line we have used the cyclic property of the trace to write the entire expression as 
a sum over products of scalars (actually the factor containing Ta is only a scalar with respect to our specially defined 
trace since it contains internal free indices). Furthermore we have performed the s-sum straightforwardly. 

The computational cost scales here as 0(n 2 D 3 ). This is because we have two sums going from 1 ton over terms 
that are contracted within 0(D 3 ) operations. 

B. Computation of iV e //( A) 

Our prescription for the computation of N e ff (A) is also based on the observation that big powers of the transfer 
matrix T can be very well approximated within the subspace spanned by the dominant eigenvectors. However here 
things are much easier than for H e ff(A.). This is because the translational invariance is not broken by the 2-site 
Hamiltonian (see figure pi) and we can write 

N eff (A) = 2N.(I)® . (20) 

Similarly to (-ffi^A m (14 1, (-Oa * s approximated by 



{I) [J] = Tr * \ TaT n-i] a £ (\ a \T A \X a ) A^ 1 (21) 



a=l 

which is computed within 0(nD 3 ) operations. 

C. Overall scaling of the computational cost 

We have seen that the gradients in (I3J) can be obtained within 0(mnD 3 ) + 0(n 2 D 3 ) respectively 0(nD 3 ) operations 
if our approximation of large powers of the transfer matrix is justified. It is easy to check that the scalar expectation 
values in ([3]) can be obtained in an analogue yet simpler way. The fact that there are no vacant sites in the corre- 
sponding tensor networks enables us to use everywhere a method identical to the one used for N e ff(A). Thus the 
computational cost for our algorithm scales as its most expensive part, namely as 0(mnD 3 ) + 0{n 2 D 3 ). 

It is also not difficult to check that for very large chains (i.e. either when N 3> £ for non-critical systems or N 3> £d 
for critical ones, where £,d is the effective correlation length induced by finite D) this scaling can be improved. First 
recall that we had in every tensor network at least one portion of the chain expressed as a power of T that we 
approximated using its dominant eigenvectors. Now, for any bond dimension D there exists an N above which all 
approximated portions are long enough s.t. all eigenvalues except the largest one are suppressed by the very large 
exponent. In this case the overall scaling is 0(mD 3 ) + 0(nD 3 ). Note that in the scaling for the "extremal-s" terms 
we can not get rid of m because there will always be short portions between the -ffi^ and the vacant site, that must 
be contracted exactly. Similarly, for the "medium-s" terms (191 only the combinations of A^Ao 1 where both a and (3 
are large will be negligible. Factors like A™A^ must usually always be taken into account. In any case, the ultimate 
check whether our approximations are justified must be done in the simulations, where one must verify if there exists 
an n beyond which the ground state [30] energy does not decrease. 

We would like to compare our scaling of the computational cost to the one of [7] once again. Note that expressed 
in the terms used in this work, the scaling from Ref. [7 is 0(NnD 3 ). On one hand, as previously mentioned, our TI 
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algorithm yields an improvement of one factor N. On the other hand there is an additional factor n that appears 
in our scaling. This is due to the fact that we compute the gradient of the energy explicitly. It is easy to see that 
the computational cost for the evaluation of the energy itself is 0(nD 3 ). However if we would restrict ourselves to 
evaluations of the energy only, we would have to use something like a downhill simplex method as the outer function 
that scans the MPS manifold for the energy minimum. In this case the outer function would call the energy evaluator 
a huge number of times, thereby yielding the overall cost much higher than one factor of n that we must pay when 
computing the gradient. 

IV. NUMERICAL RESULTS 

We have studied both critical and non-critical nearest neighbour interaction spin models. The first one is the 
Quantum Ising model for spins-1/2 

h is = - y, z i z i - B Y, X * ( 22 ) 

which we have simulated at its critical point B = 1. The second one is the antiferromagnetic Heisenberg model 

H HB = - J2 ( X * X i + Y ^ + Z i Z i) ■ ( 23 ) 

This model is critical for spin-1/2 chains but non-critical for spin-1 chains. We have studied both cases. Note that 



(23) is not very well suited for the description with 1-site TI MPS due to its antiferromagnetic character. In order to 
cure this problem we apply in the case of the spin-1/2 chain a global unitary consisting of Pauli-y matrices on each 
second site [3T]. This leaves the spectrum unchanged and after we have found the 1-site TI MPS for the ground state, 
we can recover the one for the unchanged Hamiltonian by a new application of the global unitary. The resulting MPS 
is then of course 2-site TI. The rotated Heisenberg Hamiltonian reads 



1 






Hhb = ^}^ {-XiXj + YiYi - ZiZ-) . (24) 



2 ^ 



A. Critical systems 



Let us illustrate the strategy for the scan of the parameter space spanned by {to, n} on the basis of results obtained 
for small critical chains of 100 and 400 sites. Figure [6] and figure [7] show the relative precision A re iEo(m,n) = 
^■gexact _ fiMPS ^ m ^ n ^ j ■ jgexact £ ^g MPS ground state energy compared to the exact solution as a function of the 
algorithm parameters m and n for the Quantum Ising respectively Heisenberg chain. The first observation is that there 
exist m max and n max s.t. for all m > m max , n > n max the precision does not improve any more. In the featured plots 
the plateau V with minimal energy is reached within the plot range. The optimal point {m opt , n op t} is then the point of 
V that minimizes the scaling of the computational cost 0(mnD 3 )+0(n 2 D 3 ) i.e. {m op t, n opt } = min \{m,n}ev( mn + n2 )- 
Clearly, the optimal parameters m opt and n opt will be different for different models and different values of the chain 
length N and the MPS bond dimension D. 

The plots reveal a further detail: if we are not very pedantic about the optimal {to, n}-pair, it is not necessary to 
scan the entire plane, which is computationally very expensive. If we are willing to settle for any pair {m, n} that 
yields maximal precision, we can scan along any line n = km and we can be sure that at some point we will hit V '. 
This pair is quasi-optimal in the sense that we have found the optimal n for the corresponding m and vice versa. 
This is due to the fact that for any point of V, especially for its boundary, walking along lines with increasing m or 
n does not take us out of V . As one can see in figure [6] and [7J V is roughly symmetric in m and n, so a sensible line 
to scan along is given by n = m (32j . As we have mentioned before, our algorithm allows us to increase m only up to 
(N — 2)/2. If until then, the results obtained along n = m have not converged yet, we must continue the scan along 
the line given by the constant maximal m towards larger n. 

The relative precision of the MPS ground state energy for such line scans is plotted in figure [8] We notice that 
with increasing D the maximally reachable precision gets better in concordance to what one would expect. The fact 
that m op t and n op t increase with D is also intuitive. What is a bit surprising is that for small n the results obtained 
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FIG. 6: (Color online). Critical Quantum Ising chain with N — 100: Relative precision of the MPS ground state energy as 
compared to the analytical result as a function of the parameters (m,n) for D = 16 (left) and D = 32 (right). 
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FIG. 7: (Color online). Critical Heisenberg chain with N = 100: Relative precision of the MPS ground state energy as 
compared to the analytical result as a function of the parameters (m,n) for D = 16 (left) and D — 32 (right). 

for small bond dimensions are either similar or even better than the ones obtained for higher bond dimensions. This 
means that if one is not willing to go to larger values of n, there is no point in increasing D\ 

Another interesting point is that for fixed D, as we increase N, the plateau V is reached sooner and sooner (i.e. for 
smaller values of n and implicitly of m) . This behaviour is due to the fact that with increasing N the weight that we 
loose in our contracted tensor network by choosing n < D 2 becomes negligible at smaller n. 



B. Observables - energy and correlation functions 



As the computational cost of our algorithm actually decreases if we increase the number of sites N while keeping 
D constant, we can investigate PBC chains of arbitrary size [33]. Figure[9]and figure |l0| show the relative precision of 
the ground state energy for the critical Quantum Ising respectively Heisenberg model as a function of the MPS bond 



12 




200 



10 



10 



10" 



10 



10" 



10 






- D=12 


— >^- 


-D=16 




- D=20 


— >f— 


- D=24 




- D=28 


— >f— 


- D=32 



20 



40 



60 



80 



100 



FIG. 8: (Color online). Critical Quantum Ising chain with N = 100 (left) and TV = 400 (right): Relative precision of the MPS 
ground state energy as a function of the parameter n for different bond dimensions D. The scan has been performed along the 
line m = 5n up to the maximal value of m and then along the line with constant m = (N — 2)/2. 
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FIG. 9: (Color online). Critical Quantum Ising model: relative precision of the MPS ground state energy for different N as a 
function of D. 



dimension D. We can see that generally the relative error is decreasing as a polynomial of D i.e. A re iE (D) oc D~^. 
We have fitted straight lines through the reliable [33] data of the N = 100 and TV = 5000 plots and have obtained 
for the exponent \x the values 7.84 and 3.21 (6.12 and 2.52) for the critical Quantum Ising (Heisenberg) model. In 
the central plots (i.e. N = 500 and N — 1000) one can distinguish between two regions where the relative precision 
is decaying polynomially with the exponents obtained from the outer plots (i.e. N = 100 and TV = 5000). We have 
emphasized this by drawing dashed lines through the data points in the central plots. Note that the dashed lines are 
not fitted, they have merely the same slope as the full lines in the outer plots. This behaviour can be best understood 
if one looks at correlation functions. 

Let us first consider the critical Quantum Ising model. In figure [TT] we have plotted the ZZ and the XX correlation 
functions T zz (Ar) and T xx (Ar) [3Sj in the MPS ground state of a chain with N = 500 sites. The solid line represents 
the exact solution obtained by applying the programme of 20 to the Quantum Ising model with PBC. One can clearly 
see that with increasing D the MPS correlations become more and more accurate, just as one would expect. Note 
that we have only plotted the correlation functions for separations Ar < N/2. This is because due to the periodic 
boundary conditions T(Ar) is symmetric around Ar = N/2 [36] . We would like to point out that while the exact 
r(Ar) is linear for small Ar thus implying polynomial decay of correlations in that regime, it flattens out towards 
Ar f=a N/2. This behaviour is consistent with the physical requirement that the correlation function is smooth at 
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FIG. 10: (Color online). Critical Heisenberg model: relative precision of the MPS ground state energy for different N as a 
function of D. 
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FIG. 11: (Color online). Correlation functions for a critical Quantum Ising chain with N = 500. Left: 
correlator T zz (Ar) and as inset the half-chain correlator as a function of D. Right: correlator F xx (Ar) 
half-chain correlator as a function of D. 



order parameter 
and as inset the 



Ar = N/2. The insets show the value of the half-chain correlators T N / 2 (D) := T(Ar = N/2,D) as a function of 
D. One can clearly see a jump in 1^/2 (-D) at some D'. This means that in this model, if one wants to obtain good 
approximations for long range correlations in the ground state, one must use MPS with bond dimension D > D'(N). 
Note that the jump in the inset of figure ITT] occurs roughly in the same region as the change of the slope in the second 
plot of figure [9] This allows us to understand why in figure [9] the slope for large D is steeper than the one for small D: 
if D is not large enough such that correlations are faithfully reproduced throughout the entire chain, this represents 
a further source of error besides the inherent error of MPS with non-exponential bond dimension (i.e. D <C d N ^ 2 ). 

The absolute value of the correlation functions [37] for the critical Heisenberg chain with N = 500 sites can be 
found in figure 12 Note that these plots only contain the MPS data since we do not have analytical expressions for 
the long range correlations. Qualitatively figure |i~2] shows the same behaviour as figure [TT] Quantitatively we can see 
that correlation functions converge at much larger D than in the case of the critical Quantum Ising model, which is 
exactly what we would expect. The half-chain correlators T N / 2 (D) exhibit a more or less continuous transition to the 
region where correlations are faithfully reproduced. 

We would like to make an inte rest ing final remark regarding the error in the correlation functions as a function 
of Ar. In the left part of figure 
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for different D in the regime where 



the half-chain correlators have wellconverged (i.e. D > 25). The surprising thing is that the error does not grow 
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FIG. 12: (Color online). Absolute value of the correlation functions for a critical Heisenberg chain with N — 500. Left: 
correlator T zz (Ar) and as inset the half-chain correlator as a function of D. Right: correlator F xx (Ar) and as inset the 
half-chain correlator as a function of D. 
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FIG. 13: (Color online). Relative precision of correlation functions in the MPS ground state of the Quantum Ising model with 
N = 500. Left: error of the order parameter correlation function Y zz (Ar) for several different D in the high precision regime. 
Right: relative precision of the half-chain correlators TfjJ 2 and F xx , as a function of the MPS bond dimension D. 



monotonically as a function of Ar as one would expect, but that it rather oscillates around zero. Nevertheless the 
amplitude of the oscillations is growing monotonically with Ar. The right part of figure [13] reveals that similary to the 
relative error of the ground state energy, the relative error of the half-chain correlators A re iT N / 2 {D) obeys power-law 
decay as a function of D in the large D regime. 

Our numerical analysis thus indicates that for each N there is a minimum value of D — D'(N) such that correlations 
throughout the entire chain are properly captured. As investigated in |18j , for critical systems this minimum value of 
D'(N) is seen to be given by a small power of Af that depends on the universality class of the model. This dependence 
will allow us in (18j to characterize the cost of the algorithm presented in this work as a power of N. For the moment 
we will settle for a scaling of the overall computational cost of 0(g(D,£/N)D 3 ) where g{D,^/N) will be seen to 
become trivial only for non-critical systems. 
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C. Non-critical systems 
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FIG. 14: (Color online). Spin-1 Heisenberg chain with N = 100: Relative precision of the MPS ground state energy as 
compared to the best numerical approximation as a function of the parameters (m, n) for D — 16 (left) and D — 100 (right). 

We have seen that for critical systems it is quite involved to predict the computational cost of MPS algorithms that 
find the optimal approximation of the ground state within the manifold defined by MPS with fixed bond dimension 
D. This turns out to be much easier for non-critical systems where the correlation length £ is much smaller than the 
chain length N. We have studied the spin-1 Heisenberg chain as the prototype of a non-critical quantum spin chain 
in order to be able to compare our results with the ones presented in Ref. [7|. As pictured in figure [IIJ for N = 100 
and D that is not too big, n = 4 is sufficient in order to obtain the optimal MPS approximation to the ground state. 
This is in agreement with the predictions of Ref. [7 |. However for D as big as 100, we would have to choose n = 7 
if we are not willing to loose any precision. This indicates a dependence of n on D which is much weaker than in 
the case of critical systems. Since due to finite computer memory we cannot increase D arbitrarily, it is safe to say 
that for systems where £ <C N, n is given by a small constant. This is exactly what happens for a spin-1 Heisenberg 
chain with 100 sites since as shown in Ref. [ST] the correlation length is roughly £ sw 6 s.t. ^<JV. It is obvious from 
figure [14] that m can be chosen arbitrarily so we can fix it to m — 1. Thus in this case the cost of our algorithm 
scales like 0(D 3 ) which is indeed by a factor N less than the cost from [7]. Nevertheless we must emphasize that for 
systems where the condition £ -c N is not fulfilled anymore, the picture of a small constant n breaks down and the 
characterization of the computational cost becomes non-trivial. 

In figure 15 we have plotted the relative energy precision and the correlation functions as functions of D. Note 
that for the "exact" ground state energy density we have used Eq = —1.401484039 which is the value obtained by 
an extrapolation of our own finite D results to infinite D. We have done this since the ground state energy that we 
obtain for D — 100 is smaller than any other value we have found in the literature, and in particular smaller than the 
one used as the "exact" ground state energy in Ref. [7j. 

The correlation functions plotted in figure 15 show non-trivial behaviour around Ar rj N/2 where they clearly 
deviate from exponential decay. The half-chain correlator plotted in the inset seems to converge as a function of D 
but we do not have compelling evidence for that. 



CONCLUSIONS 



We have demonstrated the performance of a gradient based algorithm for the simulation of TI spin chains with 
PBC both for critical and non-critical systems. For critical systems where the correlation length is of the order of the 
system size, the overall scaling of the computational cost is 0(mnD 3 ) + 0(n 2 D 3 ) and we have given an analysis of 



the parameter space {m, n} with a prescription of how to obtain a quasi-optimal pair {m 



t}. In the special case 



Opt 7 ^Opt 

of a critical system that is simulated by MPS with comparatively small D, such that £d -c N holds for the induced 
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FIG. 15: (Color online). Spin-1 Heisenberg chain with N = 100: Left: relative precision of the MPS ground state energy as a 
function of D. Right: absolute value of the correlation functions Y zz (Ar) and as inset the half-chain correlator as a function 
of D. 

correlation length, the overall scaling is given by 0(mD 3 ) + 0(nD 3 ). For non-critical systems with a correlation 
length that is much smaller than the system size, increasing D barely affects the parameters m and n and we can 
write for the overall scaling 0(D 3 ). In the last two cases the cost is one factor N less than the one of the algorithm 
presented in Ref. [7J. However, for critical systems in the large- D regime, the cost of Ref. |7J is improved merely by 
a factor N/n due to the appearence of n 2 in the scaling of our algorithm. The precision of our numerical results is 
comparable with or even better than that of previous algorithms with the same bond dimension. With a TI MPS 
approximation of the ground state at hand it is possible to develop efficient MPS algorithms for the computation of 
excitations in TI systems. 
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[24] For chains where (< iV the cost will not depend on N at all. If £ « N on the other hand the cost will contain a factor 
that is smaller N but is nevertheless an artifact thereof. 

[25] This is basically due to the fact that the bond dimension truncation method used in [121 113) can be shown to be optimal 
(in a certain sense) only for infinitely long chains. We have used a straightforward adaptation of that method for finite 
chains with PBC and the results are between one and a few orders of magnitude worse than the ones obtained by the 
gradient method described in this work. 

[26] This is actually only an approximation of the exact contraction. However, due to finite machine precision, there is effectively 
no difference between both results. 

[27] Normally one denotes the eigenvector corresponding to the eigenvalue with the largest magnitude as the dominant eigen- 
vector. Accordingly, the obvious meaning of the plural (i.e. dominant eigenvectors) would be to denote the eigenvectors of 
a degenerated dominant eigenvalue. However, we rather use the term dominant eigenvectors in order to refer to a set of 
eigenvectors whose corresponding eigenvalues have the largest magnitude among all eigenvalues. 

[28] Obtaining this eigenbasis does not spoil the overall computational cost as it scales better than the contractions of the 
tensor networks. Due to the sparse structure of T, one can obtain its n dominant eigenvectors with 0(nD 3 ) operations. 

[29] In the extremal case of optimal n = D 2 the overall scaling becomes 0(D 7 ). 

[30] I.e. the state with the lowest energy which we can achieve within the constrained MPS-approximation that is used. 

[31] For the spin-1 chain we must apply the operator M = exp(J7rY) on every second site in order to obtain the same effect. 

[32] In practice it might be better to choose k < 1 since there are parts of the algorithm with the scaling 0(nD 3 ) multiplied 
by a big constant factor. In our simulations we have used k = 1/5. 

[33] However the precision is getting worse if we increase the chain length without increasing D. 

[34] If D is too large for a given chain length N, the optimal parameter n can get close to its maximal value i.e. n ~ D 2 . In 
these cases the line scan described in section |IV A| converges at moderate n only due to finite machine precision. However, 
the precision of the MPS that is obtained in this way is not the one that is theoretically maximally achievable with an 
MPS of bond dimension D. We emphasize that with infini te m achine precision the line scan with converge only close to 
= D 2 and also the large D points in figure |9J and figure 10 would lie roughly on the line corresponding to polynomial 



n 



decay. 

[35] T zz (Ar) = (Z r Z r+ Ar) - (Zr)(Z r+ Ar), V xx ( Ar) = {X r X r+Ar } - (X r )(X r+Ar ). 

[36] This holds for even N. In the case of odd N we have T((A - i)/2) = T({N + i)/2),\/i 6 {1,3,5, . . . , N - 2}. 
[37] Due to the antiferromagnetic nature of the Heisenberg model the groundstate correlation function is changing its sign from 

site to site. 



